Sanmi Koyejo

Stanford University

Welfare, Improvability, and Variance: A Principal-Agent Approach to Optimal Benchmark Item Aggregation

May 29, 2026
Via arXiv

Is Backpropagation Optimal? When Synthetic Gradients Improve Sample Efficiency

May 27, 2026
Via arXiv

AI Cartography: Mapping the Latent Landscape of AI Benchmark Ecosystems

May 24, 2026
Via arXiv

General Preference Reinforcement Learning

May 21, 2026
Via arXiv

The Distillation Game: Adaptive Attacks & Efficient Defenses

May 21, 2026
Via arXiv

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

May 06, 2026
Via arXiv

Stop Automating Peer Review Without Rigorous Evaluation

May 04, 2026
Via arXiv

SWE-chat: Coding Agent Interactions From Real Users in the Wild

Apr 22, 2026
Via arXiv

HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks

Apr 10, 2026
Via arXiv

SCENEBench: An Audio Understanding Benchmark Grounded in Assistive and Industrial Use Cases

Mar 10, 2026
Via arXiv